Controllable Text Simplification Using Lexically Constrained Decoding Based on Edit Operation Prediction

Abstract

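Difficulty control in text simplification supports language learning by simplifying sentences to match a target difficulty level. Existing methods for this task have two problems: they are hard to train to paraphrase the input substantially, and they have difficulty generating sentences flexibly. The proposed method creates constraints on words that should appear in the simplified output sentence and constraints on words that should not appear, and uses them to simplify text while controlling its difficulty. The constraints are created through edit operation prediction for each word in the sentence, difficulty assessment, and paraphrasing of difficult words into simpler ones. Because the proposed method promotes paraphrasing through positive and negative constraints while generating sentences flexibly with a sequence-to-sequence model, it can solve the problems of existing methods. Evaluation experiments confirmed that the proposed method achieves text simplification matched to the target difficulty level without harming grammaticality or losing much of the sentence meaning.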
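
The constraint-creation step described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy lexicons `WORD_LEVEL` and `PARAPHRASES`, the rule-based stand-in `predict_edit`, and the helper `build_constraints` are hypothetical and are not the paper's models or data. The sketch only shows how per-word edit prediction, difficulty assessment, and paraphrase lookup might combine into positive and negative word constraints, which would then be handed to a lexically constrained decoder.

```python
# Hypothetical toy lexicons (assumptions, not the paper's resources).
# Difficulty levels: 1 = easy, 3 = hard.
WORD_LEVEL = {
    "commence": 3, "start": 1, "begin": 1,
    "purchase": 3, "acquire": 3, "buy": 1,
}
PARAPHRASES = {
    "commence": ["start", "begin"],
    "purchase": ["acquire", "buy"],
}


def predict_edit(word: str, target_level: int) -> str:
    """Stand-in for a learned edit operation predictor.

    Assumption: a word is marked REPLACE when its lexicon level exceeds
    the target level; words missing from the lexicon are treated as easy.
    """
    return "REPLACE" if WORD_LEVEL.get(word, 1) > target_level else "KEEP"


def build_constraints(tokens: list[str], target_level: int) -> tuple[list[str], list[str]]:
    """Return (positive, negative) word constraints for one sentence."""
    positive: list[str] = []  # words the output should contain
    negative: list[str] = []  # words the output should not contain
    for word in tokens:
        if predict_edit(word, target_level) != "REPLACE":
            continue
        negative.append(word)
        # Keep only paraphrase candidates that meet the target difficulty.
        # Candidates missing from the lexicon are treated as too hard.
        candidates = [
            p for p in PARAPHRASES.get(word, [])
            if WORD_LEVEL.get(p, 99) <= target_level
        ]
        if candidates:
            positive.append(candidates[0])
    return positive, negative


if __name__ == "__main__":
    sentence = "they commence to purchase goods".split()
    print(build_constraints(sentence, target_level=1))
```

In this toy setup a lower `target_level` yields more constraints, and a target level at or above the hardest word yields none, so the input would be left for the decoder to reproduce freely.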

Similar resources

Lexically Constrained Decoding for Sequence Generation Using Grid Beam Search

We present Grid Beam Search (GBS), an algorithm which extends beam search to allow the inclusion of pre-specified lexical constraints. The algorithm can be used with any model that generates a sequence $\hat{y} = \{y_0 \ldots y_T\}$, by maximizing $p(y \mid x) = \prod_t p(y_t \mid x; \{y_0 \ldots y_{t-1}\})$. Lexical constraints take the form of phrases or words that must be present in the output sequence. This is a very general ...

Mesh Surface Simplification Based on Constrained Energy

A triangle-mesh-surface-simplification algorithm based on a constrained energy function is proposed. In the algorithm, vertices are deleted iteratively according to the dimension-less energy function, which represents the cost of surface modification by vertex deletion and subsequent hole re-triangulation. The energy function is used both for selecting a vertex with the minimum energy for delet...

Constrained Decoding for Text-Level Discourse Parsing

This paper presents a novel approach to document-based discourse analysis by performing a global A* search over the space of possible structures while optimizing a global criterion over the set of potential coherence relations. Existing approaches to discourse analysis have so far relied on greedy search strategies or restricted themselves to sentence-level discourse parsing. Another advantage ...

Evaluating Text Segmentation using Boundary Edit Distance

This work proposes a new segmentation evaluation metric, named boundary similarity (B), an inter-coder agreement coefficient adaptation, and a confusion-matrix for segmentation that are all based upon an adaptation of the boundary edit distance in Fournier and Inkpen (2012). Existing segmentation metrics such as Pk, WindowDiff, and Segmentation Similarity (S) are all able to award partial credi...

Improving the Operation of Text Categorization Systems with Selecting Proper Features Based on PSO-LA

With the explosive growth in the amount of information, tools and methods are needed to search, filter, and manage resources. One of the major problems in text classification is the high-dimensional feature space. Therefore, the main goal of text classification is to reduce the dimensionality of the feature space. There are many feature selection methods. However...

Journal

Journal title: Shizen gengo shori

Year: 2023

ISSN: 1340-7619, 2185-8314

DOI: https://doi.org/10.5715/jnlp.30.991